An online audio indexing system

Authors

  • Jitendra Ajmera
  • Iain McCowan
  • Hervé Bourlard
Abstract

This paper presents an overview of an online audio indexing system, which creates a searchable index of the speech content embedded in digitized audio files. The system is based on our recently proposed offline audio segmentation techniques. As the data arrives continuously, the system first finds the boundaries of acoustically homogeneous segments. Next, each segment is classified as speech, music, or mixture, where a mixture is a region in which speech and non-speech sounds are present simultaneously and noticeably. The speech segments are then clustered to provide consistent speaker labels, and the speech and mixture segments are converted to text by an ASR system. The resulting words are time-stamped and stored with other metadata (speaker identity, speech confidence score) in an XML file, so that target segments can be identified and accessed rapidly. In this paper, we analyze the performance of each stage of this audio indexing system and compare it with the performance of the corresponding offline modules.
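The abstract does not give the layout of the XML index, so the following is only a minimal sketch of how time-stamped words and segment metadata might be stored and searched. The element and attribute names (`audio_index`, `segment`, `word`, `speaker`, `speech_confidence`), the input dictionary schema, and the sample values are all assumptions made for illustration, not the authors' format.

```python
import xml.etree.ElementTree as ET


def build_index(segments):
    """Build an XML index from segment dicts (hypothetical schema)."""
    root = ET.Element("audio_index")
    for seg in segments:
        seg_el = ET.SubElement(
            root,
            "segment",
            start=f"{seg['start']:.2f}",
            end=f"{seg['end']:.2f}",
            label=seg["label"],
        )
        # Music segments are not sent to ASR, so they carry no words.
        if seg["label"] == "music":
            continue
        # Speaker labels come from clustering the speech segments,
        # so a mixture segment may not have one.
        if "speaker" in seg:
            seg_el.set("speaker", seg["speaker"])
        seg_el.set("speech_confidence", f"{seg['confidence']:.2f}")
        for text, time in seg["words"]:
            word_el = ET.SubElement(seg_el, "word", time=f"{time:.2f}")
            word_el.text = text
    return root


def find_word(root, query):
    """Return (time, speaker) for every occurrence of `query` in the index."""
    hits = []
    for seg_el in root.iter("segment"):
        for word_el in seg_el.iter("word"):
            if word_el.text.lower() == query.lower():
                hits.append((float(word_el.get("time")), seg_el.get("speaker")))
    return hits


# Made-up example data: one speech, one music, and one mixture segment.
segments = [
    {"start": 0.0, "end": 4.2, "label": "speech", "speaker": "spk1",
     "confidence": 0.93, "words": [("good", 0.3), ("evening", 0.7)]},
    {"start": 4.2, "end": 9.0, "label": "music"},
    {"start": 9.0, "end": 12.5, "label": "mixture",
     "confidence": 0.61, "words": [("evening", 9.4), ("news", 9.9)]},
]

index = build_index(segments)
print(ET.tostring(index, encoding="unicode"))
print(find_word(index, "evening"))
```

Keeping the words nested inside their segment means a word hit also gives the speaker label and confidence score of the enclosing segment, which is the kind of lookup the abstract describes as rapid access to target segments.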


Similar resources

An Online System for Automatic Annotation of Audio Documents

This article presents a system for automatic transcription of audio documents. The system includes online implementations of recent algorithms for audio segmentation, speech/nonspeech classification, and speaker clustering, and integrates them with large vocabulary speech recognition systems for both English and French. We also propose a segment-based speech confidence score, and demonstrate th...


Pii: S0031-3203(01)00061-9

Broadcasters are demonstrating interest in building digital archives of their assets for reuse of archive materials for TV programs or on-line availability. This requires tools for video indexing and retrieval by content. Effective indexing by content of videos is based on the association of high-level information associated with visual data. In this paper a system is presented that enables ...


Experimental Results in Audio Indexing

In this paper we describe the IBM Audio-Indexing System and present some experimental results on the performance of the system on an audio indexing task.


A Framework for Indexing Personal Videoconference

The rapid technical advance of multimedia communication has enabled more and more people to enjoy videoconferences. Traditionally, the personal videoconference is either not recorded or only recorded as ordinary audio and video files that only allow linear access. Moreover, in addition to video and audio channels, other videoconferencing channels, including text chat, file transfer, and whitebo...


Online speaker adaptation and tracking for real-time speech recognition

This paper describes a low-latency online speaker adaptation framework. The main objective is to apply fast speaker adaptation to a real-time (RT) large vocabulary continuous speech recognition (LVCSR) engine. In this framework, speaker adaptation is performed on speaker turns generated by online speaker change detection and speaker clustering. To maximize long-term system performance, the adap...




Publication date: 2004